Predicting Retrieval Quality for Resource Selection in Distributed Information Retrieval

Authors

  • Henrik Nottelmann
  • Norbert Fuhr
Abstract

In a federated digital library system, it is too expensive to query every accessible library. Resource selection is the task of deciding to which libraries a query should be passed. Most existing resource selection algorithms compute a library ranking heuristically. In this paper, we follow a different approach with a better theoretical foundation. The decision-theoretic approach tries to minimise the overall costs of distributed retrieval. Besides retrieval quality, these costs include time and money. We present different methods for estimating the retrieval quality of a library. We explore the relationship between the probability of inference and the probability of relevance, and approximate indexing weights with a normal distribution. We also evaluate the different methods on a large test-bed.
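The cost-minimisation idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the cost model (a fixed connection cost, a per-document time cost, and a quality cost equal to the expected number of non-relevant documents) and the names `make_cost` and `select_resources` are assumptions made only for illustration. Given one cost function per library, the sketch finds how many documents to request from each library so that `n` documents are retrieved at minimal total cost.

```python
from typing import Callable, Dict, List, Tuple

CostFn = Callable[[int], float]


def make_cost(fixed: float, per_doc: float, precision: float) -> CostFn:
    """Hypothetical cost model for one library (an assumption, not from the paper).

    fixed     -- connection cost paid once if the library is queried at all
    per_doc   -- time/money cost per retrieved document
    precision -- assumed fraction of retrieved documents that are relevant
    """
    def cost(s: int) -> float:
        if s == 0:
            return 0.0  # an unselected library costs nothing
        # quality cost = expected number of non-relevant documents
        return fixed + per_doc * s + (1.0 - precision) * s
    return cost


def select_resources(cost_fns: List[CostFn], n: int) -> List[int]:
    """Return s[i], the number of documents to request from library i,
    such that sum(s) == n and the summed cost is minimal (dynamic programming)."""
    # best[k] = (minimal cost, allocation) for k documents over the libraries seen so far
    best: Dict[int, Tuple[float, List[int]]] = {0: (0.0, [])}
    for cost in cost_fns:
        nxt: Dict[int, Tuple[float, List[int]]] = {}
        for k, (c, alloc) in best.items():
            for s in range(n - k + 1):
                total = c + cost(s)
                if k + s not in nxt or total < nxt[k + s][0]:
                    nxt[k + s] = (total, alloc + [s])
        best = nxt
    return best[n][1]


libraries = [
    make_cost(fixed=1.0, per_doc=0.1, precision=0.9),
    make_cost(fixed=0.5, per_doc=0.1, precision=0.5),
    make_cost(fixed=2.0, per_doc=0.1, precision=0.8),
]
allocation = select_resources(libraries, 10)
print(allocation)
```

With these made-up numbers, the first library is cheapest per document and its connection cost is low enough that all ten documents are requested from it. A library with higher precision but a large fixed cost would only be chosen once enough documents are needed to offset that cost.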

Similar resources

Decision-Theoretic Resource Selection for Different Data Types in MIND

In a federated digital library system, it is too expensive to query every accessible library. Resource selection is the task of deciding to which libraries a query should be routed. In this paper, we describe a novel technique that is used in the MIND project. Our approach, the decision-theoretic framework (DTF), differs from existing algorithms like CORI in two ways: It computes a selection which mi...

Adaptive Query-Based Sampling of Distributed Collections

As part of a Distributed Information Retrieval system, a description of each remote information resource, archive or repository is usually stored centrally in order to facilitate resource selection. The acquisition of precise resource descriptions is therefore an important phase in Distributed Information Retrieval, as the quality of such representations will impact on selection accuracy, and ul...

Distributed Information Retrieval: A Multi-Objective Resource Selection Approach

Information retrieval is becoming increasingly concerned with resource selection and data fusion for distributed archives. In distributed information retrieval, a user submits a query to a broker, which determines a solution for how to yield a given number of documents from all available resources. In this paper, we present a multi-objective model for resource selection, in which four aspects: ...

Performance Evaluation of Medical Image Retrieval Systems Based on a Systematic Review of the Current Literature

Background and Aim: Images, as an information vehicle that can convey a large volume of information, are especially important in the field of medicine. The existence of different attributes of image features and various search algorithms in medical image retrieval systems, and the lack of an authority to evaluate the quality of retrieval systems, make a systematic review in medical image retrieval system...

Evaluation of Resource Description Quality Measures

An open problem for Distributed Information Retrieval is how to represent large document repositories (known as resources) efficiently. To facilitate resource selection, estimated descriptions of each resource are required, especially when faced with non-cooperative distributed environments[1]. Accurate and efficient resource description estimation is required as this can have an effect on reso...

Publication date: 2002